Jetscii
A tiny library to efficiently search strings for substrings or sets of ASCII characters.
Examples
Searching for a set of ASCII characters
extern crate jetscii;
Searching for a substring
use jetscii::Substring;
let colors: Vec<_> = "red, blue, green".split(Substring::new(", ")).collect();
assert_eq!(&colors, &["red", "blue", "green"]);
What's so special about this library?
We use a particular set of x86-64 SSE 4.2 instructions (`PCMPESTRI` and `PCMPESTRM`) to gain great speedups. This method stays fast even when searching for a character in a set of up to 16 choices.
When the `PCMPxSTRx` instructions are not available, we fall back to reasonably fast but universally-supported methods.
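To make the idea concrete, here is a minimal sketch of searching for any byte from a small set, with a runtime check for SSE 4.2 and a portable fallback. It is not Jetscii's actual implementation: the function names `find_any`, `find_any_sse42`, and `find_any_scalar` are hypothetical, and only the standard library's `std::arch` intrinsics and `is_x86_feature_detected!` macro are assumed.

```rust
/// Portable fallback: index of the first byte of `haystack` that is in `needles`.
/// (Hypothetical helper, not part of Jetscii.)
fn find_any_scalar(haystack: &[u8], needles: &[u8]) -> Option<usize> {
    haystack.iter().position(|b| needles.contains(b))
}

/// SSE 4.2 path: compares 16 haystack bytes at a time against a set of up to
/// 16 needle bytes using PCMPESTRI in "equal any" mode.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "sse4.2")]
unsafe fn find_any_sse42(haystack: &[u8], needles: &[u8]) -> Option<usize> {
    use std::arch::x86_64::*;

    // Unsigned bytes, match any needle, report the lowest matching index.
    const MODE: i32 = _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_ANY | _SIDD_LEAST_SIGNIFICANT;

    assert!(needles.len() <= 16);
    // Load the needle set, zero-padded, into one 128-bit register.
    let mut set_buf = [0u8; 16];
    set_buf[..needles.len()].copy_from_slice(needles);
    let set = _mm_loadu_si128(set_buf.as_ptr() as *const __m128i);
    let set_len = needles.len() as i32;

    let mut offset = 0usize;
    let mut chunks = haystack.chunks_exact(16);
    for chunk in &mut chunks {
        let block = _mm_loadu_si128(chunk.as_ptr() as *const __m128i);
        // Returns 16 when no byte of the block is in the set.
        let idx = _mm_cmpestri::<MODE>(set, set_len, block, 16);
        if idx < 16 {
            return Some(offset + idx as usize);
        }
        offset += 16;
    }

    // Copy the final partial block into a padded buffer so the load stays in bounds.
    let tail = chunks.remainder();
    if !tail.is_empty() {
        let mut buf = [0u8; 16];
        buf[..tail.len()].copy_from_slice(tail);
        let block = _mm_loadu_si128(buf.as_ptr() as *const __m128i);
        // The explicit length makes the padding bytes invisible to the comparison.
        let idx = _mm_cmpestri::<MODE>(set, set_len, block, tail.len() as i32);
        if (idx as usize) < tail.len() {
            return Some(offset + idx as usize);
        }
    }
    None
}

/// Picks the SSE 4.2 path when the CPU supports it, otherwise the fallback.
fn find_any(haystack: &str, needles: &[u8]) -> Option<usize> {
    #[cfg(target_arch = "x86_64")]
    {
        if needles.len() <= 16 && is_x86_feature_detected!("sse4.2") {
            // SAFETY: SSE 4.2 support was verified at runtime just above.
            return unsafe { find_any_sse42(haystack.as_bytes(), needles) };
        }
    }
    find_any_scalar(haystack.as_bytes(), needles)
}
```

With this sketch, `find_any("86-J52:rev1", b"-:")` returns `Some(2)`. The 16-byte register width is what bounds the set at 16 choices: the whole set fits in one operand, so adding more needles (up to that limit) does not add instructions per block.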
Benchmarks
Single character
Searching a 5MiB string of `a`s with a single space at the end:
Method | Speed |
---|---|
`str.find(AsciiChars)` | 5719 MB/s |
`str.as_bytes().iter().position(\|&v\| …)` | … |
`str.find(\|c\| …)` | … |
`str.find(' ')` | 1085 MB/s |
`str.find(&[' '][..])` | 602 MB/s |
`str.find(" ")` | 293 MB/s |
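A benchmark of this shape could be reproduced roughly as follows. This is only a sketch using the standard library: `make_haystack` and `throughput_mb_per_s` are hypothetical helpers, one MB is assumed to mean 10^6 bytes, and the figures in the table were not necessarily measured this way.

```rust
use std::hint::black_box;
use std::time::Instant;

/// Builds a string of `len` bytes: `len - 1` copies of 'a' followed by `last`.
/// `last` is assumed to be an ASCII character so the byte length is exact.
fn make_haystack(len: usize, last: char) -> String {
    assert!(len >= 1 && last.is_ascii());
    let mut s = "a".repeat(len - 1);
    s.push(last);
    s
}

/// Runs `search` over `haystack` `iters` times and returns throughput in MB/s
/// (assuming 1 MB = 1_000_000 bytes).
fn throughput_mb_per_s<F>(haystack: &str, iters: u32, mut search: F) -> f64
where
    F: FnMut(&str) -> Option<usize>,
{
    let start = Instant::now();
    for _ in 0..iters {
        // black_box keeps the optimizer from eliding the search.
        let found = search(black_box(haystack));
        black_box(found);
    }
    let secs = start.elapsed().as_secs_f64();
    (haystack.len() as f64 * f64::from(iters)) / 1_000_000.0 / secs
}
```

For example, `throughput_mb_per_s(&make_haystack(5 * 1024 * 1024, ' '), 100, |s| s.find(' '))` measures the `str.find(' ')` row. Placing the match at the very end forces every method to scan the whole string, so the result reflects raw scanning speed.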
Set of characters
Searching a 5MiB string of `a`s with a single ampersand at the end:
Method | Speed |
---|---|
`str.find(AsciiChars)` | 5688 MB/s |
`str.as_bytes().iter().position(\|&v\| …)` | … |
`str.find(\|c\| …)` | … |
`str.find(&['<', '>', '&'][..])` | 361 MB/s |
Substrings
Method | Speed |
---|---|
`str.find(Substring::new("xyzzy"))` | 5017 MB/s |
`str.find("xyzzy")` | 3837 MB/s |
Contributing
- Fork it ( https://github.com/shepmaster/jetscii/fork )
- Create your feature branch (`git checkout -b my-new-feature`)
- Add a failing test.
- Add code to pass the test.
- Commit your changes (`git commit -am 'Add some feature'`)
- Ensure tests pass.
- Push to the branch (`git push origin my-new-feature`)
- Create a new Pull Request